Automatic Genre and Show Identification of Broadcast Media
Abstract
Huge amounts of digital video are produced and broadcast every day, leading to giant media archives. Effective techniques are needed to make such data more accessible. Automatic meta-data labelling of broadcast media is an essential task for multimedia indexing, for which multi-modal input is standard. This paper describes a novel method for automatic detection of media genre and show identity using acoustic features, textual features, or a combination of the two. Furthermore, the inclusion of available meta-data, such as time of broadcast, is shown to lead to very high performance. Latent Dirichlet Allocation is used to model both acoustics and text, yielding fixed-dimensional representations of media recordings that can then be used in Support Vector Machine based classification. Experiments are conducted on more than 1200 hours of TV broadcasts from the British Broadcasting Corporation (BBC), where the task is to categorise the broadcasts into 8 genres or 133 show identities. On a 200-hour test set, accuracies of 98.6% and 85.7% were achieved for genre and show identification respectively, using a combination of acoustic and textual features with meta-data.
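The idea of turning recordings of any length into fixed-dimensional vectors can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the paper's implementation: the toy topic tables, the token names, the smoothing value `alpha`, the four broadcast time slots, and the use of plain concatenation to combine the streams. The inference step is a simplified EM fold-in over fixed topics that stands in for full Latent Dirichlet Allocation inference.

```python
from collections import Counter


def topic_proportions(tokens, topic_word, n_iter=50, alpha=0.1):
    """Map a variable-length token list to a fixed-length topic vector.

    topic_word is a list of dicts (one per topic) mapping token -> probability.
    This is a simplified EM fold-in with fixed topics, NOT full LDA inference.
    """
    k = len(topic_word)
    counts = Counter(tokens)
    theta = [1.0 / k] * k  # start from uniform topic proportions
    for _ in range(n_iter):
        expected = [alpha] * k  # smoothing pseudo-counts (assumed value)
        for tok, n in counts.items():
            # Responsibility of each topic for this token.
            weights = [theta[t] * topic_word[t].get(tok, 1e-9) for t in range(k)]
            total = sum(weights)
            for t in range(k):
                expected[t] += n * weights[t] / total
        norm = sum(expected)
        theta = [e / norm for e in expected]
    return theta


def time_slot_one_hot(hour, n_slots=4):
    """Encode the hour of broadcast (0-23) as a one-hot vector over equal slots."""
    slot = hour * n_slots // 24
    return [1.0 if i == slot else 0.0 for i in range(n_slots)]


def fuse(acoustic_theta, text_theta, hour):
    """Concatenate acoustic topics, text topics and meta-data (assumed fusion)."""
    return acoustic_theta + text_theta + time_slot_one_hot(hour)


# Hypothetical topic tables; "a1".."a3" stand for quantised acoustic frames.
acoustic_topics = [{"a1": 0.7, "a2": 0.3}, {"a2": 0.3, "a3": 0.7}]
text_topics = [
    {"election": 0.5, "minister": 0.4, "goal": 0.1},
    {"goal": 0.5, "match": 0.4, "minister": 0.1},
]

short_rec = fuse(
    topic_proportions(["a1", "a1", "a2"], acoustic_topics),
    topic_proportions(["election", "minister"], text_topics),
    hour=18,
)
long_rec = fuse(
    topic_proportions(["a3"] * 40 + ["a2"] * 10, acoustic_topics),
    topic_proportions(["goal", "match"] * 30, text_topics),
    hour=15,
)
```

Both `short_rec` and `long_rec` have the same length regardless of how many tokens each recording contains, which is what allows a standard classifier such as a Support Vector Machine to consume them. The classifier itself is left out of this sketch.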
Similar resources
Implementing a Characterization of Genre for Automatic Genre Identification of Web Pages
In this paper, we propose an implementable characterization of genre suitable for automatic genre identification of web pages. This characterization is implemented as an inferential model based on a modified version of Bayes’ theorem. Such a model can deal with genre hybridism and individualization, two important forces behind genre evolution. Results show that this approach is effective and is...
Genre Categorization and Modeling for Broadcast Speech Transcription
Broadcast News (BN) speech transcription has attracted research since the mid-1990s due to the challenges of the task. More recently, research has been moving towards more spontaneous broadcast data, commonly called Broadcast Conversation (BC) speech. Considering the large style difference between BN and BC genres, specific modeling of genres should intuitively result in improved s...
The Effect of Broadcast Digitalization on Agricultural Information Dissemination in Nigeria.
Broadcast digitalization, with its enormous benefits to the broadcasting industry, will improve the quality of program content delivered by television stations. Africa has a switchover date of June 2017. For Nigerians to have access to television broadcasts once the switchover is completed, they must purchase high-definition television sets or a set-top box. The awareness among urban dwelle...
Song-level features and SVMs for music classification
Searching and organizing growing digital music collections requires automatic classification of music. Our system for artist and genre identification uses support vector machines to classify songs based on features calculated over their entire lengths. Since support vector machines are exemplar-based classifiers, training on and classifying entire songs instead of short-time features makes intu...
Girth, minimum degree, independence, and broadcast independence
An independent broadcast on a connected graph $G$ is a function $f:V(G)\to \mathbb{N}_0$ such that, for every vertex $x$ of $G$, the value $f(x)$ is at most the eccentricity of $x$ in $G$, and $f(x)>0$ implies that $f(y)=0$ for every vertex $y$ of $G$ within distance at most $f(x)$ from $x$. The broadcast independence number $\alpha_b(G)$ of $G$ is the largest weight $\sum\limits_{x\in V(G)}f(x)$ of an ind...